JS: Support a taint tracking for arguments of .apply() function call #6559
Impressive work. For now you need to run the autoformatter on a bunch of files; that is what is causing the CI to fail.
And I definitely need to run some performance tests on this, and I unfortunately suspect that these tests will show significant performance regressions. I'm especially looking at the removal of Unfortunately I don't think I'll have time to take a close look at your PR this week, but I'll definitely spend some time on it Monday.
Thanks, @erik-krogh!
I made this fix when I added the tests in
Performance doesn't look good, and it's caused by Let's look at the definition of that type. TApplyArgumentNode(MethodCallExpr ce, Function func, int i) { exists(func.getParameter(i)) } The way such a You can try it out yourself by running any data-flow query on the gatsby project. (E.g. So we need to do it some other way. An (untested) idea I have is to create a new dataflow node associated with each function. This new node would hold the "array" from an Something entirely different might also work; it was just a quick idea I had. Let me know if you need help with that, or if you want me to give it a go. The
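To illustrate why a definition of that shape grows so quickly, here is a rough sketch in plain JavaScript rather than QL. It assumes that nothing in the body of the definition ties the call `ce` to the function `func`, so every call is paired with every parameter of every function. The arrays `methodCalls` and `functions` and the helper `unconstrainedNodes` are hypothetical stand-ins, not part of CodeQL.

```javascript
// Hypothetical program model: only the sizes matter here.
const methodCalls = ["c1", "c2", "c3"]; // stand-ins for MethodCallExpr values
const functions = [
  { name: "f", params: ["a", "b"] },
  { name: "g", params: ["x"] },
];

// Mirrors the shape TApplyArgumentNode(ce, func, i) with the only
// constraint being that func has a parameter at index i.
// Nothing relates `ce` to `func`, so this is a full cross product.
function unconstrainedNodes(calls, funcs) {
  const nodes = [];
  for (const ce of calls) {
    for (const func of funcs) {
      for (let i = 0; i < func.params.length; i++) {
        nodes.push([ce, func.name, i]);
      }
    }
  }
  return nodes;
}

const nodes = unconstrainedNodes(methodCalls, functions);
console.log(nodes.length); // 3 calls * (2 + 1) parameters = 9
```

On a real project the count would be roughly the number of method calls multiplied by the total number of parameters, which may be why a large project such as gatsby shows the problem so clearly.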
Thank you for the update, @erik-krogh! I understand now how
If you go that route I think you'll end up with recursion between the callgraph and the definition of DataFlow nodes, which will likely result in non-monotonic recursion. I'm looking forward to seeing what you come up with.
Hi @erik-krogh! I'm back to work on the PR. I made a fix to avoid a combinatorial explosion in I guess that your suggestion with I tested the last fix on the gatsby project and
I don't think we have much documentation. There is some here, but I don't think it's that useful here. What I tend to do is have the option I'll kick off a performance evaluation, and then take a closer look at your code when that comes back.
A performance evaluation came back, and it looks somewhat OK. Of the projects I benchmarked on, the bwip-js project had the biggest performance regression (about 30% when running the security suite). Your implementation looks good, given the algorithm you've decided to implement.
I've become more interested in exploring my
Here is my experiment: 9d96260 With a I'll get a performance evaluation going on my own experiment.
Thanks for sharing your approach. I will try to reproduce my tests on your code. At first glance, A good point about
Great! I started to work on the
I can investigate this issue later when we choose an approach. I'd like to ask how you run your performance benchmarks, because I just run queries from VSCode and compare the time reported in the UI. That is probably not the best way: I see about a 10% difference from run to run on the same data. Do you perhaps run it from the CLI with cache invalidation, maybe with some other important parameters?
We have some internal tools for measuring performance across many projects and queries. I usually use VSCode, running once with a baseline and once with the PR I want to test. I sometimes fix the CPU frequency, otherwise thermals will have a big impact on performance (if you're using a laptop). When I then have two output logs, I usually open them both in VSCode side by side and start by looking at the
That's very helpful. Thanks!
Took a look at your code and tests in detail.
Looking forward to the perf test results. I got some numbers from the gatsby project, it has 16 Seems we forgot about I also noticed weird taint propagation for
I got some numbers from my experiment. The worst case is still Good point about
Draft-stating the PR due to inactivity, but feel free to undraft when you have implemented the planned changes.
Implemented DataFlow::Node's for arguments of .apply function calls and propagation of tainted values from the argsArray parameter to these arguments. Fixed small issues in the array initialization and improved unit tests, see arrays-init.js and call-apply.js.
@erik-krogh, take a look at this PR. We discussed this one in Slack.
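For readers unfamiliar with the pattern, here is a minimal JavaScript sketch of the kind of flow this PR is about: a value placed in the array passed to `.apply` ends up in a positional parameter of the callee. The names `source`, `sink`, `forward`, `wrapper` and `seen` are hypothetical placeholders chosen for illustration; they are not taken from call-apply.js or from the CodeQL libraries.

```javascript
// Hypothetical placeholders: `source` yields a tainted value,
// `sink` records whatever reaches it.
const seen = [];
function source() { return "tainted"; }
function sink(value) { seen.push(value); }

function forward(a, b) {
  // `b` receives the second element of the argsArray.
  sink(b);
}

// Array literal as argsArray: element 1 becomes parameter `b`.
forward.apply(null, ["safe", source()]);

// Forwarding via `arguments`: the same positional mapping applies.
function wrapper() {
  forward.apply(this, arguments);
}
wrapper("safe", source());

console.log(seen);
```

An analysis that does not model the mapping from array elements to parameters would miss both flows, since the tainted value never appears as a direct argument of `forward`.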